45 research outputs found

STAR-Fusion: Fast and Accurate Fusion Transcript Detection from RNA-Seq

Author: Bankapur Asma
Doak Thomas
Dobin Alex
Ganote Carrie
Gingeras Thomas
Haas Brian
Li Bo
Pochet Nathalie
Regev Aviv
Stransky Nicolas
Sun Jing
Tickle Timothy
Wu Catherine
Yang Xiao
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 24/03/2017

Motivation: Fusion genes created by genomic rearrangements can be potent drivers of tumorigenesis. However, accurate identification of functional fusion genes from genomic sequencing requires whole genome sequencing, since exonic sequencing alone is often insufficient. Transcriptome sequencing provides a direct, highly effective alternative for capturing molecular evidence of expressed fusions in the precision medicine pipeline, but current methods tend to be inefficient or insufficiently accurate, lacking in sensitivity or predicting large numbers of false positives. Here, we describe STAR-Fusion, a method that is both fast and accurate in identifying fusion transcripts from RNA-Seq data. Results: We benchmarked STAR-Fusion's fusion detection accuracy using both simulated and genuine Illumina paired-end RNA-Seq data, and show that it has superior performance compared to popular alternative fusion detection methods. Availability and implementation: STAR-Fusion is implemented in Perl, freely available as open source software at http://star-fusion.github.io, and supported on Linux.

Cold Spring Harbor Laboratory Institutional Repository

Evidence for Transcript Networks Composed of Chimeric RNAs in Human Cells

Author: A Dobin
A Pombo
Adam Frankish
AJ Walhout
Alex Dobin
Alexandre Reymond
Alfonso Valencia
Bryan R. Lajoie
CA Maher
Catherine Ucla
Chenwei Lin
Christelle Borel
CJ McManus
Cédric Howald
D Gordon
DA Jackson
David Martin
E Birney
E Gilboa
EL Sonnhammer
Erica Dumais
F Denoeud
F Ozsolak
G Parra
H Kaessmann
H Li
HM Temin
Ian Bell
J Cocquet
J Dostie
J Harrow
J Houseley
Jacqueline Chrast
JE Collins
Jennifer Harrow
JL Thorvaldsen
Job Dekker
John Stamatoyannopoulos
Jonathan M. Mudge
Jorg Drenkow
Josep Lluís Gelpí
Julien Lagarde
K Kannan
K Salehi-Ashtiani
Kourosh Salehi-Ashtiani
LG Wilming
Lila Ghamsari
M Krzywinski
MA Quail
Marc Vidal
MI Krzywinski
Michael L. Tress
MJ Fullwood
Modesto Orozco
Nynke L. van Berkum
P Akiva
P Kapranov
P Unneberg
Paolo Ribeca
Philipp Kapranov
Philippe Batut
R Durbin
R Khanin
Roderic Guigó
RR Bowman
Ryan R. Murray
S Djebali
S Rozen
Sarah Djebali
SF Altschul
SM Searle
Stylianos E. Antonarakis
SW Roy
Sylvain Foissac
Thomas Preiss
Thomas R. Gingeras
Tim Hubbard
TR Gingeras
Vincent Lacroix
WJ Kent
X Li
X Wu
Xinping Yang
Y Qu
Publication venue: Public Library of Science
Publication date: 01/01/2012

The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in organisms from yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

Serveur académique lausannois

HAL Descartes

eScholarship@UMMS

UPF Digital Repository

ProdInra

Hal-Diderot

FigShare

Public Library of Science (PLOS)

Crossref

Harvard University - DASH

INRIA a CCSD electronic archive server

PubMed Central

King's Research Portal

Diposit Digital de la Universitat de Barcelona

HAL-Rennes 1

Enhanced Transcriptome Maps from Multiple Mouse Tissues Reveal Evolutionary Constraint in Gene Expression for Thousands of Genes

Author: Balasubramanian Suganthi
Beer Michael
Breschi Alessandra
Bussotti Giovanni
Davis Carrie
Djebali Sarah
Dobin Alex
Drenkow Jorg
Fastuca Meagan
Gerstein Mark
Gingeras Thomas
Guigo Roderic
Harmanci Arif
Lagarde Julien
Monlong Jean
Notredame Cedric
Pei Baikang
Pervouchine Dmitri
Prieto Barja Pablo
See Lei-Hoon
Tanzer Andrea
Wang Huaien
Zaleski Chris
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 30/10/2014

We characterized by RNA-seq the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles obtained in human cell lines reveals substantial conservation of transcriptional programs, and uncovers a distinct class of genes whose levels of expression across cell types and species have been constrained early in vertebrate evolution. This core set of genes captures a substantial and constant fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with strong and conserved epigenetic marking, as well as with a characteristic post-transcriptional regulatory program in which sub-cellular localization and alternative splicing play comparatively large roles.

Cold Spring Harbor Laboratory Institutional Repository

NaviSE: superenhancer navigator integrating epigenomics signal algebra

Author: A Dobin
A Subramanian
AJ Müller-Molina
Alex M. Ascensión
Ander Izeta
AR Afzal
B Langmead
C Tyner
CJ Schoenherr
D Hnisz
D Szklarczyk
DS Castro
EY Chen
F Luo
H Heyn
H Li
J Kang
J Kim
J Lovén
J Phuchareon
J Toubiana
J Yu
JA Chong
JC Villaescusa
JG Kang
K Jaioun
K Kumano
K Takahashi
LS Lim
M Rebhan
Marcos J. Araúzo-Bravo
MC Hu
Mikel Arrospide-Elgarresta
MS Jin
PL Luu
R Edgar
RC Adam
S Ortega
S Pott
V Parravicini
WA Whyte
WP Lee
Y Ohkubo
Y Zhang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Landscape of transcription in human cells

Eukaryotic cells make many types of primary and processed RNAs that are found either in specific sub-cellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic sub-cellular localizations are also poorly understood. Since RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modifications and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations taken together prompt a redefinition of the concept of a gene.

Carolina Digital Repository

A community challenge to evaluate RNA-seq, fusion detection, and isoform quantification methods for cancer discovery

The accurate identification and quantitation of RNA isoforms present in the cancer transcriptome is key for analyses ranging from the inference of the impacts of somatic variants to pathway analysis to biomarker development and subtype discovery. The ICGC-TCGA DREAM Somatic Mutation Calling in RNA (SMC-RNA) challenge was a crowd-sourced effort to benchmark methods for RNA isoform quantification and fusion detection from bulk cancer RNA sequencing (RNA-seq) data. It concluded in 2018 with a comparison of 77 fusion detection entries and 65 isoform quantification entries on 51 synthetic tumors and 32 cell lines with spiked-in fusion constructs. We report the entries used to build this benchmark, the leaderboard results, and the experimental features associated with the accurate prediction of RNA species. This challenge required submissions to be in the form of containerized workflows, meaning each of the entries described is easily reusable through CWL and Docker containers at https://github.com/SMC-RNA-challenge. A record of this paper's transparent peer review process is included in the supplemental information

Cold Spring Harbor Laboratory Institutional Repository

eScholarship - University of California

Comparative analysis of the transcriptome across distant species

Author: Adam Frankish
Alex Dobin
Alexandre Reymond
Ali Mortazavi
Anastasia Samsonova
Andrea Tanzer
Ann Hammonds
Anurag Sethi
Arif O. Harmanci
AT Kalinka
Baikang Pei
Benjamin W. Booth
BR Graveley
Brent Ewing
Brenton R. Graveley
Brian Oliver
Burak H. Alver
Carrie A. Davis
Chao Cheng
Chao Di
Chau Huynh
Chenghai Xue
Chris Zaleski
Cristina Sisu
Cédric Howald
D Brawand
Daifeng Wang
David M. Miller
DF Simola
Dionna Kasper
Dmitri Pervouchine
Elise A. Feingold
Eric Lai
Erik Ladewig
Felix Schlesinger
Frank J. Slack
Gang Fang
Garrett Robinson
Gary I. Saunders
Gemma May
Gennifer Merrihew
Guanjun Gao
Guilin Wang
Haiyan Huang
Henry Zheng
Huaien Wang
J Merkin
J Reichardt
James B. Brown
Jen Harrow
Jiayu Wen
Jing Leng
Jingyi Jessica Li
JJ Li
JM Stuart
Joel Rozowsky
Jorg Drenkow
Julien Lagarde
Kathie L. Watkins
Kejia Wen
Kenneth H. Wan
Kevin Yip
Kimberly Bell
KK Yan
Koon-Kiu Yan
LaDeana Hillier
Li Yang
Long Hu
Lucy Cherbas
M Levin
M Talerico
Marcus H. Stoiber
Mark B. Gerstein
Masaomi Kato
Max E. Boeck
MB Gerstein
Megan Fastuca
Michael J. Pazin
Michael MacCoss
Michael O. Duff
modENCODE Consortium
Nathan P. Boley
NL Barbosa-Morais
Norbert Perrimon
Owen A. Thompson
Peter Cherbas
Peter J. Bickel
Peter J. Good
Peter J. Park
Pnina Strasbourger
R Karlić
Rabi Murad
Raymond Auerbach
Rebecca McWhirter
Robert R. Kitchen
Robert Waterston
Roderic Guigó
Roger A. Hoskins
Roger P. Alexander
S Djebali
S Kirkpatrick
Sara Olson
Sarah Djebali
Sonali Jha
Steven E. Brenner
Susan E. Celniker
T Domazet-Lošo
Thomas C. Kaufman
Thomas R. Gingeras
Tim J. P. Hubbard
Valerie Reinke
William C. Spencer
Yan Zhang
Zhi Lu
ZJ Lu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters

Crossref

Cold Spring Harbor Laboratory Institutional Repository

University of Birmingham Research Portal

Harvard University - DASH

Serveur académique lausannois

PubMed Central

eScholarship - University of California

UPF Digital Repository

King's Research Portal

Brunel University Research Archive

Improved reference genome of Aedes aegypti informs arbovirus vector control

Author: A Dobin
A Fontaine
AA Enayati
AB Hall
AC English
Adam M. Phillippy
AK Jones
Albin Fontaine
Alex R. Hastie
Alexander S. Raikhel
Alistair C. Darby
Allison M. Weakley
Andrea Gloria-Soria
Andrew K. Jones
Arina D. Omer
Atashi Sharma
Aviva Presser Aiden
B Langmead
B Negre
Benjamin J. Matthews
Benjamin R. Evans
BJ Matthews
BK Peterson
BL Apostol
BM Gilchrist
BR Evans
Bradley J. White
C Bass
Carlos A. Brito-Sierra
Carolyn S. McBride
Catherine A. Hill
CL Moyes
Corey L. Campbell
CS Chin
D Charlesworth
D Duboule
Daniel E. Neafsey
David B. Sattelle
DE Neafsey
DJ Begun
DW Galbraith
E Frichot
E Goulielmaki
EB Lewis
EE Hare
Erez Lieberman Aiden
Eric Cox
F Hahne
F Ortelli
Frederick A. Partridge
G Benson
G Rašić
GAH McClelland
Gareth D. Weedall
Gareth J. Lycett
GN Artemov
Gordana Rašić
GR Margarido
H Li
Han Cao
HM Robertson
Hugh M. Robertson
Igor Antoshechkin
Igor Filipović
Igor V. Sharakhov
J Catchen
J Krumsiek
J. Spencer Johnston
Jacob E. Crawford
JD Buenrostro
Jeffrey R. Powell
Jill Muehling
JM Catchen
Joe Turner
Jonas Korlach
Joyce Lee
Karla Saavedra-Rodriguez
KW Broman
L Alphey
Leslie B. Vosshall
Li Zhao
Louis Lambrechts
LV Jiménez
Margaret Herre
Maria V. Sharakhova
ME Newton
Melissa Laird Smith
Michael R. Murphy
MJ Chaisson
MJ Gorman
MM Riehle
MV Sharakhova
N Lumjuan
NC Durand
Noah H. Rose
O Dudchenko
Olga Dudchenko
Omar S. Akbari
OS Akbari
P George
P Juneja
Paul Peluso
R Patro
Raissa G. G. Kay
Richard Hall
Richard S. Mann
RM Waterhouse
S Bhatt
S Guindon
S Heinz
S Merabet
Saki Chan
Sanjit S. Batra
Sara N. Mitchell
Sarah B. Kingan
Sergey Koren
Seth N. Redmond
SF Altschul
Shruti Sharan
SJ Thomas
SK Denny
SM Kiełbasa
Sourav Roy
SS Rao
Steven D. Buckingham
T Fansiri
TD Wu
Terence D. Murphy
Thanyalak Fansiri
V Nene
VA Timoshevskiy
Vamsi K. Kodali
Vidya Ramasamy
Vinita S. Joardar
WC Black IV
William C. Black
William J. Glassford
Y Benjamini
Y Liao
Yang Wu
Zhijian Tu
Zhilei Zhao
ZN Adelman
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored physical and cytogenetic maps, doubled the number of known chemosensory ionotropic receptors that guide mosquitoes to human hosts and egg-laying sites, provided further insight into the size and composition of the sex-determining M locus, and revealed copy-number variation among glutathione S-transferase genes that are important for insecticide resistance. Using high-resolution quantitative trait locus and population genomic analyses, we mapped new candidates for dengue vector competence and insecticide resistance. AaegL5 will catalyse new biological insights and intervention strategies to fight this deadly disease vector

LJMU Research Online (Liverpool John Moores University)

University of Liverpool Repository

eScholarship - University of California

HAL-Pasteur

Oxford Brookes University: RADAR

University of Queensland eSpace

Additional file 1 of Gene-specific patterns of expression variation across organs and species

Author: Alessandra Breschi (3448616)
Alex Dobin (189598)
Carrie Davis (3448619)
Dmitri Pervouchine (3448622)
Jesse Gillis (174666)
Roderic Guigó (3448643)
Sarah Djebali (189529)
Thomas Gingeras (13717)

Figures S1–S13. File with all supplementary figures, from S1 to S13. (PDF 871 kb)

FigShare

Gene-specific patterns of expression variation across organs and species

Author: A Clauset
A Pohl
A Siepel
Alessandra Breschi
Alex Dobin
B Lenhard
BY Liao
Carrie A. Davis
D Brawand
D Welter
Dmitri D. Pervouchine
ET Chan
F Yue
H Wu
I Yanai
J Lonsdale
J Merkin
Jesse Gillis
JS Amberger
M Fontes
M Melé
MB Gerstein
MN McCall
N Pishesha
NL Barbosa-Morais
PH Sudmant
Roderic Guigó
S Lin
Sarah Djebali
The FANTOM Consortium
Thomas R. Gingeras
X Zhao
Y Benjamini
Y Gilad
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Background: A comparison of transcriptional profiles derived from different tissues in a given species or among different species assumes that commonalities reflect evolutionarily conserved programs and that differences reflect species or tissue responses to environmental conditions or developmental program staging. Apparently conflicting results have been published regarding whether organ-specific transcriptional patterns dominate over species-specific patterns, or vice versa, making it unclear to what extent the biology of a given organism can be extrapolated to another. These studies have in common that they treat the transcriptomes monolithically, implicitly ignoring that each gene is likely to have a specific pattern of transcriptional variation across organs and species. Results: We use linear models to quantify this pattern. We find a continuum in the spectrum of expression variation: the expression of some genes varies considerably across species and little across organs, and simply reflects evolutionary distance. At the other extreme are genes whose expression varies considerably across organs and little across species; these genes are much more likely to be associated with diseases than are genes whose expression varies predominantly across species. Conclusions: Whether transcriptomes, when considered globally, cluster preferentially according to one component or the other may not be a property of the transcriptomes, but rather a consequence of the dominant behavior of a subset of genes. 
Therefore, the values of the components of the variance of expression for each gene could become a useful resource when planning, interpreting, and extrapolating experimental data from mouse to humans. This project was supported by awards U54HG007004 and U41HG007234 from the National Human Genome Research Institute of the National Institutes of Health, as well as from the Spanish Ministry of Economy and Competitiveness, Centro de Excelencia Severo Ochoa 2013–2017, SEV-2012-0208, and Programa de Ayudas FPI del Ministerio de Economia y Competitividad, BES-2012-055848. We would also like to acknowledge support from the European Research Council (ERC) under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement 294653.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

UPF Digital Repository

ProdInra